Encapsulating Multiple Communication-Cost Metrics in Partitioning Sparse Rectangular Matrices for Parallel Matrix-Vector Multiplies

Authors

  • Bora Uçar
  • Cevdet Aykanat
Abstract

This paper addresses the problem of one-dimensional partitioning of structurally unsymmetric square and rectangular sparse matrices for parallel matrix-vector and matrix-transpose-vector multiplies. The objective is to minimize the communication cost while maintaining the balance on computational loads of processors. Most of the existing partitioning models consider only the total message volume, hoping that minimizing this communication-cost metric is likely to reduce other metrics. However, the total message latency (start-up time) may be more important than the total message volume. Furthermore, the maximum message volume and latency handled by a single processor are also important metrics. We propose a two-phase approach that encapsulates all these four communication-cost metrics. The objective in the first phase is to minimize the total message volume while maintaining the computational-load balance. The objective in the second phase is to encapsulate the remaining three communication-cost metrics. We propose communication-hypergraph and partitioning models for the second phase. We then present several methods for partitioning communication hypergraphs. Experiments on a wide range of test matrices show that the proposed approach yields very effective partitioning results. A parallel implementation on a PC cluster verifies that the theoretical improvements shown by partitioning results hold in practice.
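
The four metrics named in the abstract can be made concrete with a small sketch. It is not the authors' model or code: it assumes a row-parallel y = Ax in which the owner of x_j sends that entry once to every other processor holding a nonzero in column j, and it counts only the sending side. The function name `comm_metrics`, the argument names `row_part` and `x_part`, and the example matrix are all hypothetical.

```python
from collections import defaultdict

def comm_metrics(nonzeros, row_part, x_part):
    """Send-side communication metrics for row-parallel y = A x.

    nonzeros : iterable of (i, j) coordinates of the nonzeros of A
    row_part : row_part[i] is the processor that owns row i (and y_i)
    x_part   : x_part[j] is the processor that owns x_j
    (All names are illustrative assumptions, not taken from the paper.)
    """
    # (sender, receiver) -> set of x indices the sender must ship
    traffic = defaultdict(set)
    for i, j in nonzeros:
        src, dst = x_part[j], row_part[i]
        if src != dst:
            traffic[(src, dst)].add(j)

    send_volume = defaultdict(int)    # words sent per processor
    send_messages = defaultdict(int)  # messages sent per processor
    for (src, _dst), cols in traffic.items():
        send_volume[src] += len(cols)
        send_messages[src] += 1

    return {
        "total_volume": sum(send_volume.values()),
        "total_messages": len(traffic),
        "max_send_volume": max(send_volume.values(), default=0),
        "max_send_messages": max(send_messages.values(), default=0),
    }

# A 4x4 example distributed over 3 processors.
nonzeros = [(0, 0), (0, 2), (1, 1), (1, 2), (2, 2), (2, 3), (3, 0), (3, 3)]
row_part = [0, 0, 1, 2]
x_part = [0, 1, 1, 2]
metrics = comm_metrics(nonzeros, row_part, x_part)
print(metrics)
```

Two partitions with the same total volume can differ in the other three values, which is the gap the second phase described in the abstract is meant to address.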


Similar resources

Partitioning Sparse Rectangular Matrices for Parallel Computations of Ax and ATv

This paper addresses the problem of partitioning the nonzeros of sparse nonsymmetric and nonsquare matrices in order to efficiently compute parallel matrix-vector and matrix-transpose-vector multiplies. Our goal is to balance the work per processor while keeping communication costs low. Although the symmetric partitioning problem has been well-studied, the nonsymmetric and rectangular cases have...


A Library for Parallel Sparse Matrix Vector Multiplies

We provide parallel matrix-vector multiply routines for 1D and 2D partitioned sparse square and rectangular matrices. We clearly give pseudocodes that perform necessary initializations for parallel execution. We show how to maximize overlapping between communication and computation through the proper usage of compressed sparse row and compressed sparse column formats of the sparse matrices. We ...


Minimizing Communication Cost in Fine-Grain Partitioning of Sparse Matrices

We show a two-phase approach for minimizing various communication-cost metrics in fine-grain partitioning of sparse matrices for parallel processing. In the first phase, we obtain a partitioning with the existing tools on the matrix to determine computational loads of the processor. In the second phase, we try to minimize the communication-cost metrics. For this purpose, we develop communication...


Improving performance of sparse matrix dense matrix multiplication on large-scale parallel systems

We propose a comprehensive and generic framework to minimize multiple and different volume-based communication cost metrics for sparse matrix dense matrix multiplication (SpMM). SpMM is an important kernel that finds application in computational linear algebra and big data analytics. On distributed memory systems, this kernel is usually characterized with its high communication volume requireme...


Partitioning Rectangular and Structurally Unsymmetric Sparse Matrices for Parallel Processing

A common operation in scientific computing is the multiplication of a sparse, rectangular, or structurally unsymmetric matrix and a vector. In many applications the matrix-transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several ...



Journal:
  • SIAM J. Scientific Computing

Volume 25  Issue 

Pages  -

Publication date 2004